Belo Horizonte
Multiclass Graph-Based Large Margin Classifiers: Unified Approach for Support Vectors and Neural Networks
Hanriot, Vítor M., Torres, Luiz C. B., Braga, Antônio P.
While large margin classifiers are originally an outcome of an optimization framework, support vectors (SVs) can be obtained from geometric approaches. This article presents advances in the use of Gabriel graphs (GGs) in binary and multiclass classification problems. For Chipclass, a hyperparameter-less and optimization-less GG-based binary classifier, we discuss how activation functions and support edge (SE)-centered neurons affect the classification, proposing smoother functions and structural SV (SSV)-centered neurons to achieve margins with low probabilities and smoother classification contours. We extend the neural network architecture, which can be trained with backpropagation with a softmax function and a cross-entropy loss, or by solving a system of linear equations. A new subgraph-/distance-based membership function for graph regularization is also proposed, along with a new GG recomputation algorithm that is less computationally expensive than the standard approach. Experimental results with the Friedman test show that our method was better than previous GG-based classifiers and statistically equivalent to tree-based models.
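The Gabriel graph mentioned in the abstract above can be illustrated with a brute-force sketch. It uses the standard Gabriel criterion (two points are adjacent when no third point lies strictly inside the ball whose diameter is the segment joining them). The `support_edges` helper is an assumption for illustration: it treats a support edge as a Gabriel edge whose endpoints carry different class labels, which the abstract does not spell out. This is not the authors' implementation, and it does not reflect their cheaper recomputation algorithm.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gabriel_edges(points):
    """Return index pairs (i, j) that are Gabriel-graph edges (brute force, O(n^3)).

    i and j are adjacent when, for every other point k,
    d(i, j)^2 <= d(i, k)^2 + d(j, k)^2, i.e. k is not strictly
    inside the ball with diameter ij.
    """
    n = len(points)
    edges = []
    for i, j in combinations(range(n), 2):
        d_ij = sq_dist(points[i], points[j])
        if all(
            sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) >= d_ij
            for k in range(n)
            if k not in (i, j)
        ):
            edges.append((i, j))
    return edges


def support_edges(points, labels):
    """Assumed definition: Gabriel edges joining points of different classes."""
    return [(i, j) for i, j in gabriel_edges(points) if labels[i] != labels[j]]


# Four collinear points, two per class.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = [0, 0, 1, 1]
edges = gabriel_edges(points)
cross_class = support_edges(points, labels)
```

On this toy input the only cross-class edge joins the two points nearest the class boundary, which is the geometric intuition behind obtaining margin-defining points without an optimization step.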
- Europe > Portugal > Braga > Braga (0.41)
- South America > Brazil > Minas Gerais > Belo Horizonte (0.04)
- North America > United States > Wisconsin (0.04)
Hardware-Software Collaborative Computing of Photonic Spiking Reinforcement Learning for Robotic Continuous Control
Yu, Mengting, Xiang, Shuiying, Xie, Changjian, Chen, Yonghang, Zhao, Haowen, Guo, Xingxing, Zhang, Yahui, Han, Yanan, Hao, Yue
Robotic continuous control tasks impose stringent demands on the energy efficiency and latency of computing architectures due to their high-dimensional state spaces and real-time interaction requirements. Conventional electronic computing platforms face computational bottlenecks, whereas the fusion of photonic computing and spiking reinforcement learning (RL) offers a promising alternative. Here, we propose a novel computing architecture based on photonic spiking RL, which integrates the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm with a spiking neural network (SNN). The proposed architecture employs an optical-electronic hybrid computing paradigm wherein a silicon photonic Mach-Zehnder interferometer (MZI) chip executes linear matrix computations, while nonlinear spiking activations are performed in the electronic domain. Experimental validation on the Pendulum-v1 and HalfCheetah-v2 benchmarks demonstrates the system's capability for software-hardware co-inference, achieving a control policy reward of 5831 on HalfCheetah-v2, a 23.33% reduction in convergence steps, and an action deviation below 2.2%. Notably, this work represents the first application of a programmable MZI photonic computing chip to robotic continuous control tasks, attaining an energy efficiency of 1.39 TOPS/W and an ultralow computational latency of 120 ps. Such performance underscores the promise of photonic spiking RL for real-time decision-making in autonomous and industrial robotic systems.
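The split between a linear stage and a nonlinear spiking stage described in the abstract above can be sketched in plain Python. Everything specific here is an assumption for illustration: the abstract does not name the neuron model, so a leaky integrate-and-fire (LIF) update is used as a common choice, and the `decay`, `threshold`, weights, and inputs are hypothetical values. The matrix-vector product stands in for the part the abstract assigns to the MZI chip; nothing here models photonic hardware.

```python
def matvec(weights, x):
    """Linear stage: plain matrix-vector product (stand-in for the optical part)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def lif_step(potentials, currents, decay=0.5, threshold=1.0):
    """One leaky integrate-and-fire update.

    Each membrane potential leaks by `decay`, integrates its input current,
    emits a spike (1) when it reaches `threshold`, and resets to 0 after firing.
    """
    new_potentials, spikes = [], []
    for v, current in zip(potentials, currents):
        v = decay * v + current
        fired = v >= threshold
        spikes.append(1 if fired else 0)
        new_potentials.append(0.0 if fired else v)
    return new_potentials, spikes


# Hypothetical 2x2 weight matrix and a constant input held for four time steps.
weights = [[0.6, 0.0], [0.0, 0.3]]
inputs = [1.0, 1.0]
potentials = [0.0, 0.0]
spike_train = []
for _ in range(4):
    currents = matvec(weights, inputs)                    # linear stage
    potentials, spikes = lif_step(potentials, currents)   # nonlinear spiking stage
    spike_train.append(spikes)
```

With these values the first neuron accumulates enough input to fire on the third step, while the second neuron's potential settles below the threshold and never fires.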
- South America > Brazil > Minas Gerais > Belo Horizonte (0.04)
- Asia > Singapore (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Asia > China > Henan Province > Zhengzhou (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Architecture > Real Time Systems (1.00)
- Information Technology > Artificial Intelligence > Robots > Locomotion (0.94)
The Impact of Prosodic Segmentation on Speech Synthesis of Spontaneous Speech
Galdino, Julio Cesar, Leal, Sidney Evaldo, De Souza, Leticia Gabriella, Lima, Rodrigo de Freitas, Moreira, Antonio Nelson Fornari Mendes, Junior, Arnaldo Candido, Oliveira, Miguel Jr., Casanova, Edresson, Aluísio, Sandra M.
Spontaneous speech presents several challenges for speech synthesis, particularly in capturing the natural flow of conversation, including turn-taking, pauses, and disfluencies. Although speech synthesis systems have made significant progress in generating natural and intelligible speech, primarily through architectures that implicitly model prosodic features such as pitch, intensity, and duration, the construction of datasets with explicit prosodic segmentation and their impact on spontaneous speech synthesis remain largely unexplored. This paper evaluates the effects of manual and automatic prosodic segmentation annotations in Brazilian Portuguese on the quality of speech synthesized by a non-autoregressive model, FastSpeech 2. Experimental results show that training with prosodic segmentation produced slightly more intelligible and acoustically natural speech. While automatic segmentation tends to create more regular segments, manual prosodic segmentation introduces greater variability, which contributes to more natural prosody. Analysis of neutral declarative utterances showed that both training approaches reproduced the expected nuclear accent pattern, but the prosodic model aligned more closely with natural pre-nuclear contours. To support reproducibility and future research, all datasets, source code, and trained models are publicly available under the CC BY-NC-ND 4.0 license.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Lift What You Can: Green Online Learning with Heterogeneous Ensembles
Köbschall, Kirsten, Buschjäger, Sebastian, Fischer, Raphael, Hartung, Lisa, Kramer, Stefan
Ensemble methods for stream mining necessitate managing multiple models and updating them as data distributions evolve. Despite calls for greater sustainability, however, established methods pay too little attention to ensemble members' computational costs and focus too heavily on predictive capability. To address these challenges and enable green online learning, we propose heterogeneous online ensembles (HEROS). For every training step, HEROS chooses, under resource constraints, a subset of models to train from a pool of models initialized with diverse hyperparameter choices. We introduce a Markov decision process to theoretically capture the trade-offs between predictive performance and sustainability constraints. Based on this framework, we present different policies for choosing which models to train on incoming data. Most notably, we propose the novel $ζ$-policy, which focuses on training near-optimal models at reduced costs. Using a stochastic model, we theoretically prove that our $ζ$-policy achieves near-optimal performance while using fewer resources compared to the best-performing policy. In our experiments across 11 benchmark datasets, we find empirical evidence that our $ζ$-policy is a strong contribution to the state-of-the-art, demonstrating highly accurate performance, in some cases even outperforming competitors, and simultaneously being much more resource-friendly.
- Europe > Germany > Rheinland-Pfalz > Mainz (0.04)
- Europe > Switzerland (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Education > Educational Setting > Online (1.00)
- Government (0.93)
A short methodological review on social robot navigation benchmarking
Chhetri, Pranup, Torrejon, Alejandro, Eslava, Sergio, Manso, Luis J.
Social Robot Navigation is the skill that allows robots to move efficiently in human-populated environments while ensuring safety, comfort, and trust. Unlike other areas of research, the scientific community has not yet reached agreement on how Social Robot Navigation should be benchmarked. This is notably important, as the lack of a de facto standard to benchmark Social Robot Navigation can hinder the progress of the field and may lead to contradictory conclusions. Motivated by this gap, we contribute a short review focused exclusively on benchmarking trends in the period from January 2020 to July 2025. Of the 130 papers identified by our search using IEEE Xplore, we analysed the 85 papers that met the criteria of the review. This review addresses the metrics used in the literature for benchmarking purposes, the algorithms employed in such benchmarks, the use of human surveys for benchmarking, and how conclusions are drawn from the benchmarking results, when applicable.
- North America > United States > Michigan > Wayne County > Detroit (0.05)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Transportation (0.46)
- Health & Medicine (0.46)
UrbanFusion: Stochastic Multimodal Fusion for Contrastive Learning of Robust Spatial Representations
Mühlematter, Dominik J., Che, Lin, Hong, Ye, Raubal, Martin, Wiedemann, Nina
Forecasting urban phenomena such as housing prices and public health indicators requires the effective integration of various geospatial data. Current methods primarily utilize task-specific models, while recent foundation models for spatial representations often support only limited modalities and lack multimodal fusion capabilities. To overcome these challenges, we present UrbanFusion, a Geo-Foundation Model (GeoFM) that features Stochastic Multimodal Fusion (SMF). The framework employs modality-specific encoders to process different types of inputs, including street view imagery, remote sensing data, cartographic maps, and points of interest (POIs) data. These multimodal inputs are integrated via a Transformer-based fusion module that learns unified representations. An extensive evaluation across 41 tasks in 56 cities worldwide demonstrates UrbanFusion's strong generalization and predictive performance compared to state-of-the-art GeoAI models. Specifically, it 1) outperforms prior foundation models on location-encoding, 2) allows multimodal input during inference, and 3) generalizes well to regions unseen during training. UrbanFusion can flexibly utilize any subset of available modalities for a given location during both pretraining and inference, enabling broad applicability across diverse data availability scenarios. All source code is available at https://github.com/DominikM198/UrbanFusion.
- North America > United States > District of Columbia > Washington (0.14)
- Europe > United Kingdom (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- Banking & Finance > Real Estate (0.66)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)
- Transportation > Ground > Road (0.45)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)
"A 6 or a 9?": Ensemble Learning Through the Multiplicity of Performant Models and Explanations
Zuin, Gianlucca, Veloso, Adriano
Creating models from past observations and ensuring their effectiveness on new data is the essence of machine learning. However, selecting models that generalize well remains a challenging task. Related to this topic, the Rashomon Effect refers to cases where multiple models perform similarly well for a given learning problem. This often occurs in real-world scenarios, like the manufacturing process or medical diagnosis, where diverse patterns in data lead to multiple high-performing solutions. We propose the Rashomon Ensemble, a method that strategically selects models from these diverse high-performing solutions to improve generalization. By grouping models based on both their performance and explanations, we construct ensembles that maximize diversity while maintaining predictive accuracy. This selection ensures that each model covers a distinct region of the solution space, making the ensemble more robust to distribution shifts and variations in unseen data. We validate our approach on both open and proprietary collaborative real-world datasets, demonstrating up to 0.20+ AUROC improvements in scenarios where the Rashomon ratio is large. Additionally, we demonstrate tangible benefits for businesses in various real-world applications, highlighting the robustness, practicality, and effectiveness of our approach.
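The selection idea in the abstract above (keep the near-optimal models, group them by how they explain their predictions, then take one representative per group) can be sketched as follows. The details are assumptions for illustration, not the authors' procedure: `epsilon` is a hypothetical performance tolerance, the "explanation" is reduced to the name of each model's most important feature, and the Rashomon ratio is approximated as the share of a finite candidate pool that falls in the Rashomon set rather than a measure over a full hypothesis space.

```python
def rashomon_set(models, epsilon):
    """Models whose score is within `epsilon` of the best score in the pool."""
    best = max(m["score"] for m in models)
    return [m for m in models if m["score"] >= best - epsilon]


def pick_diverse(models, epsilon):
    """Keep the best near-optimal model per explanation group.

    Grouping key (assumption): the feature with the largest importance.
    """
    groups = {}
    for m in rashomon_set(models, epsilon):
        key = max(m["importance"], key=m["importance"].get)
        if key not in groups or m["score"] > groups[key]["score"]:
            groups[key] = m
    return sorted(groups.values(), key=lambda m: m["name"])


# Hypothetical candidate pool: validation score plus feature importances.
models = [
    {"name": "a", "score": 0.90, "importance": {"x1": 0.7, "x2": 0.3}},
    {"name": "b", "score": 0.89, "importance": {"x1": 0.6, "x2": 0.4}},
    {"name": "c", "score": 0.88, "importance": {"x1": 0.2, "x2": 0.8}},
    {"name": "d", "score": 0.70, "importance": {"x1": 0.1, "x2": 0.9}},
]
epsilon = 0.05
near_optimal = rashomon_set(models, epsilon)
ratio = len(near_optimal) / len(models)   # pool-based stand-in for the Rashomon ratio
ensemble = pick_diverse(models, epsilon)
ensemble_names = [m["name"] for m in ensemble]
```

Here models `a` and `b` perform similarly and rely on the same feature, so only the stronger one is kept, while `c` is retained because it reaches comparable performance through a different feature.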
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > China > Hubei Province > Wuhan (0.04)
- South America > Brazil > São Paulo (0.04)
- Research Report > New Finding (1.00)
- Research Report > Promising Solution (0.67)
Detection of Chagas Disease from the ECG: The George B. Moody PhysioNet Challenge 2025
Reyna, Matthew A., Koscova, Zuzana, Pavlus, Jan, Saghafi, Soheil, Weigle, James, Elola, Andoni, Seyedi, Salman, Campbell, Kiersten, Li, Qiao, Rad, Ali Bahrami, Ribeiro, Antônio H., Ribeiro, Antonio Luiz P., Sameni, Reza, Clifford, Gari D.
Objective: Chagas disease is a parasitic infection, primarily transmitted by insects, that is endemic to South America, Central America, and, more recently, the U.S. Chronic Chagas disease can cause cardiovascular diseases and digestive problems. Serological testing capacities for Chagas disease are limited, but Chagas cardiomyopathy often manifests in ECGs, providing an opportunity to prioritize patients for testing and treatment. Approach: The George B. Moody PhysioNet Challenge 2025 invites teams to develop algorithmic approaches for identifying Chagas disease from electrocardiograms (ECGs). Main results: This Challenge provides multiple innovations. First, we leveraged several datasets with labels from patient reports and serological testing, providing a large dataset with weak labels and smaller datasets with strong labels. Second, we augmented the data to support model robustness and generalizability to unseen data sources. Third, we applied an evaluation metric that captured the local serological testing capacity for Chagas disease to frame the machine learning problem as a triage task. Significance: Over 630 participants from 111 teams submitted over 1300 entries during the Challenge, representing diverse approaches from academia and industry worldwide.
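A capacity-aware triage metric of the kind the abstract above describes can be sketched as the share of all true positive cases that land among the highest-ranked patients, where the size of that group reflects how many serological tests are available. The abstract does not give the exact formula, so this is one plausible reading; the `fraction` parameter, the labels, and the scores are hypothetical.

```python
def tpr_at_top_fraction(labels, scores, fraction):
    """Share of all positive cases found in the top `fraction` of ranked patients.

    labels:   1 for a confirmed positive case, 0 otherwise
    scores:   model outputs, higher meaning higher priority for testing
    fraction: assumed testing capacity as a share of the cohort
    """
    total_positives = sum(labels)
    if total_positives == 0:
        return 0.0
    k = max(1, round(len(labels) * fraction))  # number of patients that can be tested
    ranked = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / total_positives


# Hypothetical cohort of ten patients, three of them positive.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.05, 0.4, 0.15, 0.6]
captured = tpr_at_top_fraction(labels, scores, fraction=0.2)
```

With capacity for two tests, this ranking sends one positive and one negative patient for testing, so one of the three positives is captured. Unlike accuracy, the metric rewards only the ordering at the top of the list, which is what matters when tests are scarce.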
- North America > Central America (0.24)
- South America > Brazil > Minas Gerais > Belo Horizonte (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
Neglected Risks: The Disturbing Reality of Children's Images in Datasets and the Urgent Call for Accountability
Caetano, Carlos, Santos, Gabriel O. dos, Petrucci, Caio, Barros, Artur, Laranjeira, Camila, Ribeiro, Leo S. F., de Mendonça, Júlia F., Santos, Jefersson A. dos, Avila, Sandra
Including children's images in datasets has raised ethical concerns, particularly regarding privacy, consent, data protection, and accountability. These datasets, often built by scraping publicly available images from the Internet, can expose children to risks such as exploitation, profiling, and tracking. Despite the growing recognition of these issues, approaches for addressing them remain limited. We explore the ethical implications of using children's images in AI datasets and propose a pipeline to detect and remove such images. As a use case, we built the pipeline on a Vision-Language Model under the Visual Question Answering task and tested it on the #PraCegoVer dataset. We also evaluate the pipeline on a subset of 100,000 images from the Open Images V7 dataset to assess its effectiveness in detecting and removing images of children. The pipeline serves as a baseline for future research, providing a starting point for more comprehensive tools and methodologies. While we leverage existing models trained on potentially problematic data, our goal is to expose and address this issue. We do not advocate for training or deploying such models, but instead call for urgent community reflection and action to protect children's rights. Ultimately, we aim to encourage the research community to exercise more than merely additional care in creating new datasets and to inspire the development of tools to protect the fundamental rights of vulnerable groups, particularly children.
- South America > Brazil > São Paulo > Campinas (0.04)
- Oceania > Australia (0.04)
- South America > Brazil > Minas Gerais > Belo Horizonte (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.93)
Generative AI as a catalyst for democratic Innovation: Enhancing citizen engagement in participatory budgeting
Sousa, Italo Alberto do Nascimento, Machado, Jorge, Vaz, Jose Carlos
This research examines the role of Generative Artificial Intelligence (AI) in enhancing citizen engagement in participatory budgeting. In response to challenges like declining civic participation and increased societal polarization, the study explores how online political participation can strengthen democracy and promote social equity. By integrating Generative AI into public consultation platforms, the research aims to improve citizen proposal formulation and foster effective dialogue between citizens and government. It assesses the capacities governments need to implement AI-enhanced participatory tools, considering technological dependencies and vulnerabilities. Analyzing technological structures, actors, interests, and strategies, the study contributes to understanding how technological advancements can reshape participatory institutions to better facilitate citizen involvement. Ultimately, the research highlights how Generative AI can transform participatory institutions, promoting inclusive, democratic engagement and empowering citizens.
- Europe > Spain > Galicia > Madrid (0.05)
- South America > Brazil > Minas Gerais > Belo Horizonte (0.05)
- South America > Brazil > Rio Grande do Sul > Porto Alegre (0.05)
- Research Report > Experimental Study (0.66)
- Research Report > New Finding (0.46)
- Law (0.94)
- Government > E-government (0.46)
- Materials > Chemicals > Specialty Chemicals (0.41)